Symbolic formatted pretty-printing #615

rbonichon · 2016-06-13T17:18:26Z

This is based on PR #595. This allows pretty-printing without low-level output: the pretty-printing engine outputs symbolic pretty-printing items in a specific symbolic buffer.

At the end of pretty-printing, the contents of the symbolic buffer can be post-processed (and modified) to actually output characters on the low-level output device. In particular, it subsumes PR #506, since it gives the user complete control of flushes and low-level output while still preserving the compositional and incremental nature of Format-based pretty-printing.

This is joint work with @pierreweis
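For readers who want to see the idea concretely, here is a minimal sketch of the workflow described above, using the symbolic output buffer functions as they are exposed in `Format` from OCaml 4.06 on. The `render` function is a hypothetical post-processing step written for this illustration; it is not part of the PR.

```ocaml
(* Pretty-print into a symbolic buffer instead of a real output device. *)
let items : Format.symbolic_output_item list =
  let sob = Format.make_symbolic_output_buffer () in
  let ppf = Format.formatter_of_symbolic_output_buffer sob in
  Format.fprintf ppf "@[<v 1>a@,b@]";
  Format.pp_print_flush ppf ();
  (* Returns the recorded items and empties the buffer. *)
  Format.flush_symbolic_output_buffer sob

(* Hypothetical post-processing: turn the symbolic items into characters.
   Each case could just as well emit something else (tabs, markup, ...). *)
let render items =
  let buf = Buffer.create 16 in
  List.iter
    (function
      | Format.Output_flush -> ()
      | Format.Output_newline -> Buffer.add_char buf '\n'
      | Format.Output_string s -> Buffer.add_string buf s
      | Format.Output_spaces n | Format.Output_indent n ->
        Buffer.add_string buf (String.make n ' '))
    items;
  Buffer.contents buf

let () = print_string (render items)
```

Nothing reaches a low-level device until `render` (or any other consumer of the item list) decides so.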

bobot · 2016-06-13T17:42:32Z

stdlib/format.mli

+  box fits on the current line, otherwise every break hint splits the line,
+- within an {e compacting} box, a break hint never splits the line, unless
+  there is no more room on the current line.
+


You should stress that the break hints concerned are the ones 'directly within' the box, not the ones inside inner boxes.

pierreweis · 2016-06-15

You're right, this is often unclear for many beginners. I propose to
rephrase it as follows:

Each different pretty-printing box kind introduces a specific line splitting
policy:

- within an {e horizontal} box, break hints never split the line (but the line
  may be split in a box nested deeper),
- within a {e vertical} box, break hints always split the line,
- within an {e horizontal/vertical} box, if the box fits on the current line
  then break hints never split the line, otherwise break hints always split the
  line,
- within an {e compacting} box, a break hint never splits the line,
  unless there is no more room on the current line.

Note that line splitting policy is box specific: the policy of a box does not
rule the policy of inner boxes. For instance, if a vertical box is nested in
a horizontal box, all break hints within the vertical box will split the
line.
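A small sketch of the policies above, assuming the default margin of the formatter used by `Format.asprintf`. The names `horizontal`, `vertical`, `nested` and `count_newlines` are only for illustration.

```ocaml
(* Count newline characters in a rendered string. *)
let count_newlines s =
  let n = ref 0 in
  String.iter (fun c -> if c = '\n' then incr n) s;
  !n

(* Break hints directly within a horizontal box never split the line. *)
let horizontal = Format.asprintf "@[<h>a@ b@ c@]"

(* Break hints directly within a vertical box always split the line. *)
let vertical = Format.asprintf "@[<v>a@ b@ c@]"

(* A vertical box nested in a horizontal box: the hints of the outer box do
   not split, the hints of the inner box do. *)
let nested = Format.asprintf "@[<h>x@ @[<v>a@ b@ c@]@ y@]"

let () =
  List.iter
    (fun (name, s) ->
       Printf.printf "%s: %d line break(s)\n%s\n\n" name (count_newlines s) s)
    [ "horizontal", horizontal; "vertical", vertical; "nested", nested ]
```

In `nested`, only the two break hints that sit directly within the vertical box produce line breaks, which is the point of the rephrasing.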

-(** {6 Boxes} *)
+(** {6 Pretty-printing boxes} *)
+
+(** The pretty-printing engine uses the concepts of pretty-printing box and
+  break hint to drive the indentation and the line splitting behavior of the
+  pretty-printer.
+  Each different pretty-printing box kind introduces a specific line splitting
+  policy:
+- within an {e horizontal} box, there is no line splitting,
+- within a {e vertical} box, every break hint splits the line,
+- within an {e horizontal/vertical} box there is no line splitting if the
+  box fits on the current line, otherwise every break hint splits the line,
+- within an {e compacting} box, a break hint never splits the line, unless
+  there is no more room on the current line.


alainfrisch · 2016-06-14T07:00:26Z

The new field for formatter_out_functions will break any code that creates such a record "from scratch" (i.e. without overriding some fields of an existing record). At least for the future, it would be nice to expose a builder function so that the record can be further extended without breaking client code (using optional arguments for new fields).

pierreweis · 2016-06-15T16:57:07Z

Alain, yes indeed it is conceivable that someone does build a
formatter_out_functions record from scratch, but this is really not probable,
since the easy way to modify the record is indeed to use a with clause.
In any case, modifying such a hypothetical piece of code will be easy using
a with clause and the resulting code will be even clearer.

Also note that this record has not been extended for decades and the new
field is the last semantic action that the pretty-printer indeed performs
that was not part of the output function record.

I think the best way to achieve a future-proof API would be to abstract the
record and only provide functions to modify an existing record. But this
would break even more existing code.


alainfrisch · 2016-06-15T17:19:22Z

I've checked LexiFi's code base. We have three occurrences where we build a custom formatter_out_functions record. Two of them list explicitly all the fields; one uses the record override form. Actually, this last form would be the trickiest one, since it should semantically be adapted to override out_indent as well but the compiler would not complain. This could lead to subtle pretty-printing bugs. The other two places are indeed trivial to fix, but this makes it impossible to have code compatible with previous versions.
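To make the silent failure mode concrete, here is a sketch assuming OCaml 4.06 or later, where `out_indent` exists. It replaces blanks with dots so the difference is visible; `render`, `dots`, `partial` and `full` are names invented for this example.

```ocaml
(* Render a fixed document with a formatter whose output functions have been
   adjusted by [tweak]. *)
let render tweak =
  let buf = Buffer.create 16 in
  let ppf = Format.formatter_of_buffer buf in
  let fs = Format.pp_get_formatter_out_functions ppf () in
  Format.pp_set_formatter_out_functions ppf (tweak fs);
  Format.fprintf ppf "@[<v 2>@[<h>a@ b@]@,c@]@?";
  Buffer.contents buf

(* Emit [n] dots through the formatter's own string output function. *)
let dots fs n = fs.Format.out_string (String.make n '.') 0 n

(* Override written before [out_indent] existed: it still compiles, but
   indentation after a line break keeps using plain spaces. *)
let partial = render (fun fs -> { fs with Format.out_spaces = dots fs })

(* Adapted override: indentation is redirected as well. *)
let full =
  render (fun fs ->
      { fs with Format.out_spaces = dots fs; out_indent = dots fs })

let () = Printf.printf "partial:\n%s\n\nfull:\n%s\n" partial full
```

The compiler accepts `partial` without any warning, which is why the record override form is the one most likely to hide a behavior change.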

We are generally very conservative with breaking changes in the stdlib (more than I usually wish). It would be worth checking on public OPAM packages to find out the extent of the breakage.

Can you elaborate on how this change is related to the title of the proposal? Without looking at the details, it seems to me that making a distinction between indent and spaces is not strictly related to that, or is it?

> I think the best way to achieve a future proof API would be to abstract the record and only provide functions to modify an existing record. But this would break even more existing code.

Well, it seems almost any code that manipulates formatter_out_functions would need to be adapted anyway; so perhaps it is best to break compatibility frankly, if only to prevent silent bugs (as described above).

One could also provide a builder function (easy to extend with optional arguments) and mark the concrete record as being deprecated, without actually dropping it.
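A possible shape for such a builder, as a sketch only. To stay independent of the compiler version, it uses a local stand-in record, `out_functions`, that mirrors the five fields discussed in this thread rather than `Format.formatter_out_functions` itself; `make_out_functions` is a hypothetical name and not part of the `Format` API.

```ocaml
(* Local stand-in for the record discussed in this thread. *)
type out_functions = {
  out_string : string -> int -> int -> unit;
  out_flush : unit -> unit;
  out_newline : unit -> unit;
  out_spaces : int -> unit;
  out_indent : int -> unit;
}

(* Hypothetical builder: fields added later become optional arguments with a
   backward-compatible default, so existing callers keep compiling. *)
let make_out_functions
    ?out_indent ~out_string ~out_flush ~out_newline ~out_spaces () =
  let out_indent =
    match out_indent with
    | Some f -> f
    | None -> out_spaces (* assumed default: indent like blanks *)
  in
  { out_string; out_flush; out_newline; out_spaces; out_indent }

(* A caller written before [out_indent] existed. *)
let buf = Buffer.create 16

let fs =
  make_out_functions
    ~out_string:(Buffer.add_substring buf)
    ~out_flush:ignore
    ~out_newline:(fun () -> Buffer.add_char buf '\n')
    ~out_spaces:(fun n -> Buffer.add_string buf (String.make n ' '))
    ()
```

With this shape, a record built "from scratch" never has to list fields it does not know about.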

rbonichon · 2017-02-07T16:31:36Z

@alainfrisch I have run an over-approximated analysis over all opam packages for 4.03.
Only 3 packages reference the record fields and only 1 constructs it directly (utop). The very same construction appears in OCaml's testsuite and is fixed in #595.

See the details in log_analysis.txt


Before this commit, indentation was identical to outputting blanks. This is generally true but not always: in some applications you may want to use '\t' for indentation but not for blanks. The `formatter_out_functions` record is extended with an extra settable field `out_indent`, which lets the user control this aspect of pretty-printing. The testsuite has been updated to reflect this modification.

This function is broken. Before you use this function, you might be in either one of these cases:

1. There are no pending open boxes/tags. In this case, flushing the stack has no effect.
2. You still have some open boxes and/or tags. Thus the function call flushes them.

Now, if you still had some pretty-printing to do afterwards, it makes no sense to have called this function. Say for the sake of the argument that only 1 box was left open before the call:

- Either the rest (e.g. continuation) of your pretty-printing routine assumed that the box was *closed*: now it is broken.
- Or it *knew* the box was open. The question is then why it does not close it itself (or pass this obligation to yet another part of the pretty-printing routine).

In either case, this function should *not* be used and thus it should not be exposed to the user. Let us state here that it breaks the invariants upon which the Format module has been based.

In this function, we repeatedly close the open tags. This function is now called when resetting the pretty-printing engine. This is a fix: before this commit, tags were never closed when flushing the pretty-printing engine. Actually, tags were *only* removed from the tag stack. Detail: opened -> open.


gasche · 2017-10-21T16:16:36Z

Update: this message is outdated (see Octachron's comment below).

@pierreweis, @rbonichon: this change lacks a Changes entry, so it is not advertised to users of the release -- and they don't know where to look for a reference to this PR if they want to understand the design rationale, etc.

Would it be possible for you to submit a Changes entry (as a new PR), following the CONTRIBUTING.md#changelog guidelines?

At least the following people should be mentioned in the "review by" credits: Gabriel Radanne and Florian Angeletti (that is, Drup and octachron, the first did a review of the patch and the second fixed since-tags and documentation issues after the fact).

Given that we are planning to release by the end of the month, it would be very nice if you could do this (small thing) next week. If you think you will not have the time, please let us know as soon as possible, and @Octachron or myself will write something.

Octachron · 2017-10-21T19:26:43Z

@gasche, in fact I had already added a Changes entry for the symbolic printer feature
https://github.com/ocaml/ocaml/blob/trunk/Changes#L135 ,
which even contained a short mention of the formatter_out_functions change,
but I had somehow missed the link with #595 and Drup's review.
Sorry for the resulting confusion.

Nevertheless, it would probably be clearer to split the Changes entry in two: one entry for the symbolic printer and another one for the out_indent field.

gasche · 2017-10-31T17:45:38Z

One of the packages that breaks because of the new record field is notty: see pqwy/notty#17. I'm linking the issue here for reference.

pqwy · 2017-10-31T17:51:56Z

It would not break if not for this.

Having

let display_indent state = state.pp_out_spaces

would enable code to ignore the new feature while retaining the old behavior, which is presently impossible.

That is, the new operation should be defined in terms of the old operation it is partially replacing.

rbonichon · 2017-10-31T20:33:56Z

@pqwy This is exactly the proposal of #1382

There are also drawbacks to keeping the old behavior as you suggest, as I explain in the discussion of said PR. Basically indenting is a separate concept from outputting spaces (think of the tab vs space debate for indentation).
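As an illustration of why the two concepts can differ, here is a sketch (assuming OCaml 4.06 or later) where indentation is emitted as tabs while blanks from break hints stay ordinary spaces. Mapping one indentation column to one tab is a simplification chosen for this example.

```ocaml
(* Indentation goes through [out_indent]; blanks produced by break hints that
   do not split the line go through [out_spaces]. *)
let tab_indented =
  let buf = Buffer.create 16 in
  let ppf = Format.formatter_of_buffer buf in
  let fs = Format.pp_get_formatter_out_functions ppf () in
  Format.pp_set_formatter_out_functions ppf
    { fs with
      Format.out_indent =
        (* Assumption for this sketch: one tab per indentation column. *)
        (fun n -> fs.Format.out_string (String.make n '\t') 0 n) };
  Format.fprintf ppf "@[<v 1>@[<h>if@ x@]@,y@]@?";
  Buffer.contents buf

let () = print_string (String.escaped tab_indented)
```

The blank between `if` and `x` is still a space, since it is produced by a break hint inside the horizontal box and not by indentation.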


bobot reviewed Jun 13, 2016

alainfrisch added the stdlib label Jun 20, 2016

rbonichon force-pushed the feature/symbolic_pp branch from 204dd3b to 1cd418c Compare August 23, 2016 15:31

rbonichon mentioned this pull request Dec 29, 2016

expose function to flush formatter's internal queue #506

Closed

rbonichon force-pushed the feature/symbolic_pp branch 2 times, most recently from 85faffa to 2b58802 Compare February 7, 2017 16:20

rbonichon mentioned this pull request Feb 7, 2017

Adding a new field to record formatter_out_functions to redefine the meaning of indentation #595

Closed

rbonichon force-pushed the feature/symbolic_pp branch from 2b58802 to 402b87d Compare February 7, 2017 17:05

damiendoligez added the approved label Mar 7, 2017

rbonichon and others added 5 commits April 3, 2017 15:38

[format] Documentation fixes and enhancements

12642c6

- Typos
- English
- Extended documentation for semantic tags

[format] Add symbolic pretty-printing

37b9f4c

rbonichon force-pushed the feature/symbolic_pp branch from 402b87d to d877637 Compare April 3, 2017 13:44

pierreweis and others added 2 commits April 3, 2017 15:44

[format] Documentation review and rephrasing.

4e695f6

- Some additions for formatted pretty-printing.
- Some simplification and precision; in particular, explicitly naming the standard pretty-printer when we are talking about the standard pretty-printer

[format] Extend documentation with @bobot proposal

d877637

pierreweis merged commit d877637 into ocaml:trunk Apr 3, 2017

rgrinberg mentioned this pull request Apr 3, 2017

Upcoming breakage ocaml-community/utop#197

Closed

Octachron mentioned this pull request Aug 13, 2017

Documentation: fix minor issues within Format #1290

Merged

nojb mentioned this pull request Sep 30, 2017

Clarify documentation for Format.out_indent #1382

Merged

pqwy added a commit to pqwy/notty that referenced this pull request Oct 31, 2017

ocaml 4.06

e81cab0

Adapt to breakage in ocaml/ocaml#615. Affects `I.strf`. If the background attribute leaks over the leading space on 4.06, this is it. Full fix requires dropping < 4.06 support. Fixes #17.

dbuenzli mentioned this pull request May 26, 2018

Functorize the generic format part. dbuenzli/fmt#32

Closed

keleshev mentioned this pull request Aug 24, 2018

Format.pp_print_custom_break, a more general break hint #2002

Merged

EduardoRFS pushed a commit to esy-ocaml/ocaml that referenced this pull request Jul 23, 2021

Merge pull request ocaml#615 from ocaml-multicore/kayceesrk/depend

fc5ec5a

make depend

stedolan pushed a commit to stedolan/ocaml that referenced this pull request May 24, 2022

Compiler flag -reorder-blocks-random for FDO testing (ocaml#615)

a6a5c49

* Random reordering of CFG layout
* Add a compiler flag -reorder-blocks-random seed and use it in asmgen

EmileTrotignon pushed a commit to EmileTrotignon/ocaml that referenced this pull request Jan 12, 2024

[create-pull-request] automated change (ocaml#615)

6f71f23

Co-authored-by: tmattio <tmattio@users.noreply.github.com>


rbonichon commented Jun 13, 2016

bobot Jun 13, 2016

pierreweis Jun 15, 2016

alainfrisch commented Jun 14, 2016

pierreweis commented Jun 15, 2016

alainfrisch commented Jun 15, 2016

rbonichon commented Feb 7, 2017

gasche commented Oct 21, 2017 •

edited

Octachron commented Oct 21, 2017

gasche commented Oct 31, 2017

pqwy commented Oct 31, 2017 •

edited

rbonichon commented Oct 31, 2017
